首页> 外文OA文献 >Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems
【2h】

Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

机译:加速循环归约:结构化线性系统的分布式内存快速求解器

代理获取
本网站仅为用户提供外文OA文献查询和代理获取服务,本网站没有原文。下单后我们将采用程序或人工为您竭诚获取高质量的原文,但由于OA文献来源多样且变更频繁,仍可能出现获取不到、文献不完整或与标题不符等情况,如果获取不到我们将提供退款服务。请知悉。

摘要

We present Accelerated Cyclic Reduction (ACR), a distributed-memory fast solver for rank-compressible block tridiagonal linear systems arising from the discretization of elliptic operators, developed here for three dimensions. Algorithmic synergies between Cyclic Reduction and hierarchical matrix arithmetic operations result in a solver that has O(kNlogN(logN+k2)) arithmetic complexity and O(k Nlog N) memory footprint, where N is the number of degrees of freedom and k is the rank of a block in the hierarchical approximation, and which exhibits substantial concurrency. We provide a baseline for performance and applicability by comparing with the multifrontal method with and without hierarchical semi-separable matrices, with algebraic multigrid and with the classic cyclic reduction method. Over a set of large-scale elliptic systems with features of nonsymmetry and indefiniteness, the robustness of the direct solvers extends beyond that of the multigrid solver, and relative to the multifrontal approach ACR has lower or comparable execution time and size of the factors, with substantially lower numerical ranks. ACR exhibits good strong and weak scaling in a distributed context and, as with any direct solver, is advantageous for problems that require the solution of multiple right-hand sides. Numerical experiments show that the rank k patterns are of O(1) for the Poisson equation and of O(n) for the indefinite Helmholtz equation. The solver is ideal in situations where low-accuracy solutions are sufficient, or otherwise as a preconditioner within an iterative method.
机译:我们提出了加速循环归约法(ACR),一种由椭圆算子离散化产生的秩可压缩块三对角线性系统的分布式内存快速求解器,在这里针对三个维度进行了开发。循环归约和层次矩阵算术运算之间的算法协同作用导致求解器具有O(kNlogN(logN + k2))算术复杂度和O(k Nlog N)内存占用量,其中N是自由度数,k是块近似在分层逼近中的等级,并且表现出相当大的并发性。通过与使用和不使用分层半可分离矩阵,代数多重网格和经典循环归约法的多面方法进行比较,我们为性能和适用性提供了基准。在一组具有非对称性和不确定性特征的大规模椭圆系统上,直接求解器的鲁棒性超出了多网格求解器的鲁棒性,并且相对于多前沿方法,ACR的执行时间和因素大小更低或相当。较低的数字等级。 ACR在分布式环境中表现出良好的强弱缩放,并且与任何直接求解器一样,对于需要解决多个右侧问题的问题很有用。数值实验表明,泊松方程的秩k为O(1),不定Helmholtz方程的秩为k(n)。在低精度解决方案就足够了的情况下,或者作为迭代方法中的前提条件,求解器是理想的选择。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
代理获取

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号